ABSTRACT



A User's Response to the Use of Listening 
Assessment Instruments



This paper examines five well-known listening measurement devices from the viewpoint of
a "user" of listening assessment instruments. The strengths, weaknesses, procedural
problems, and conceptualizations of each are considered. Applications of each are suggested
and future needs in the area of listening measurement and research are discussed.



Many studies have verified the premier finding of Rankin (1926) that listening is the most
frequently used mode of human communication. Indeed, almost every paper written concerning
listening includes, early on, this insightful comment. At first blush, one wonders, then, why it
is that we have to keep stating the obvious and feel compelled to support the assertion with
footnotes citing Rankin, Nichols, etc. The answer very well might be that simply saying
something does not mean it will be heard, and simply hearing something does not mean it will
have any effect upon the subsequent behavior of the receiver. The receivers of our assertions
are the members of our discipline. Despite our incessant pleading, our discipline's attention to
listening does not mirror the voiced importance alluded to above.

Pedagogical consideration of listening seemed to increase dramatically during the 1950's and
early 1960's, perhaps because of the availability of the Brown-Carlsen Listening Test (Brown
& Carlsen, 1955). In the mid-1960's, a number of criticisms of listening tests, and indeed of
the whole conceptualization of listening, surfaced. Perhaps as a result of these criticisms the
number of published studies of listening declined. There is less research on listening published
in our journals now than there was in the 1950's. This relative paucity of research is reflected
in basic speech textbooks and, perhaps more critically, in listening texts. Two recent basic
listening textbooks footnote as many studies done prior to 1960 as they do studies done after
1960 (Steil et al., 1983; Wolvin et al., 1982). Other scholarly works fare no better. In 1978,
ERIC and SCA jointly issued Assessing Functional Communication (Larson et al., 1978) in which
listening assessment was discussed. Only one of the references cited in the article was written
within five years of the publication date of the article, while nine were written prior to 1960.

There are indications that this decline in scholarly attention is ending. A new organization, the
International Listening Association, is gaining strength. The business community is increasing
its emphasis on listening training. Perhaps most striking of all, the number of listening
assessment devices available to the communication researcher and/or teacher is increasing. The
Brown-Carlsen test remains the most popular assessment device; however, both the
Watson-Barker Listening Test (Watson & Barker, 1984b) and the Kentucky Comprehensive
Listening Test (Bostrom, 1983) are gaining popularity. The Communication Competency
Assessment Instrument (Rubin, 1982) has been utilized at a number of locations, and the
Learning Skills Inventory (Heun, Heun & Schnucker, 1977) approach to the assessment of
communication skills has drastically changed the advising and teaching patterns of several
universities.

All of these instruments have many positive attributes. Though all manifest some weaknesses,
all function well when used for appropriate purposes. The remainder of this paper will focus on
the utility of the various instruments and attempt to highlight problems that potential users
should be aware of before selecting any one of the instruments.
Learning Skills Inventory






The Learning Skills Inventory, unlike other instruments, does not purport to measure how well
a person does communicate. Rather, through a series of responses to 220 questions, a learning
skills profile is developed. These self-evaluations provide insights concerning how well the
respondents think they relate, symbolize, think, and remember. This self-report instrument
also assesses the affect component of communication by asking questions concerning what the
respondents "like to do." This is an important strength of the Learning Skills Inventory.
McCroskey (1982) points out that assessment instruments that focus on skills typically address
themselves to the question of whether or not a person can do something, and not on whether they
do do something. While it is important that people learn the required competencies and skills,
he cogently points out that the goal of instruction is "ultimate behavior." He suggests that
failure to communicate competently may not always be related to lack of skills or competencies,
"but rather may be the product of affective inhibition in people who are both competent and
skilled" (p. 6). Different people need different training. "Some need to develop skills. Others
need to alter their communicative orientations and feelings. Accurate diagnosis should precede
instruction. Confusing competence with performance and/or ignoring affect will lead to both
inaccurate diagnoses and ineffective instruction" (McCroskey, 1982, p. 7). Knowledge and skill
are not enough to predict effectiveness. A person's motivation to communicate must be entered
into the equation as well. This is precisely the problem that the Learning Skills Inventory
attempts to solve. Respondents disclose what they perceive is their own relative development in
each of four major skill areas, broken down into fifty-five sub-skill areas. High, middle, and
low strength skills are indicated. The reliability of this test has been extensively tested and
is acceptable. When placed in computers, the test is easy to take and can be easily scored, with
the results quickly dispensed to the respondent.

Were users only provided this list of perceived strengths and weaknesses, the usefulness of this
data would vary greatly, depending upon the perspicacity of the test administrator. Fortunately,
such is not the case. Along with the instrument, the developers provide the user with a
"guidebook" that helps the user interpret the results of the test and, most importantly,
formulate adaptive learning strategies. The data allow for mapping individualized
alternative pathways for learning. At colleges where this instrument is employed, students use
the information to develop their own strategy for gathering information and obtaining meaning
from that information. They and their advisors and teachers use this information when deciding
on a program of skill enhancement. Thus this information allows the student to learn most
effectively, while at the same time indicating skill areas that need work. The same instrument
has been a help to teachers and business persons in assessing their own strengths and
weaknesses and in developing strategies to make better use of the former and plan methods for
dealing with the latter.

There are some limitations associated with this instrument. Knowing that they feel they are
weak in listening might motivate students to take courses that depend primarily on textbook
information. However, if no such option is available to the students, frustrations are increased.
Institutional users of this instrument need to be committed to providing alternatives that answer
these needs. Further, students who feel they are weak in listening may also be motivated to take
listening courses. The lack of such courses at many universities may further increase the
students' frustration.

This instrument is not suited to the task of competency or skill assessment. As with most
self-report instruments, it is highly fakeable. It cannot be used to assess changes in either
competency or skill levels. Nor can it be used to judge the relative competency or skill of two or
more people. While the creators of the test rightly state that the test has face validity (users
are asked to respond to statements like "I am a good listener") for the purpose of discovering
self-perceptions, no claim has been or can be made concerning its validity as a measure of
listening competence or skill. It has been my experience that some very fine listeners
(subjectively judged by a panel of one expert - myself) have judged themselves weak and many
poor listeners (located using the same unreliable method) have assessed themselves as being
strong in this area. This problem is not unique to this test. It may be that some people are poor
judges of their own abilities, or it may be that their working definition of concepts like "good
listening" differs markedly from my own. Finally, the test is administered in written form, and
is thus dependent upon the ability of the user to read effectively. Respondents need to be able to
read and comprehend the test, a presupposition that we can no longer automatically make
concerning our college students. While the vocabulary level should prove no barrier to the
typical college student, the current trends of lowering ACT scores at open enrollment
institutions and of increasing numbers of students in remedial classes should not comfort the
future user. Future versions may have to be administered orally. Such a procedure would
allow attempts at explaining the various statements. This is an option of the Communication
Competency Assessment Instrument and might be worthwhile if inter-user consistency is
deemed desirable.

Communication Competency Assessment Instrument

The Communication Competency Assessment Instrument (CCAI) shares one characteristic with
the Learning Skills Inventory in that both attempt to assess multiple skills within the same
testing procedure, but there the commonalities cease. A growing number of institutions use the
CCAI to assess a variety of communication skills of college students in order to determine if they
have attained certain competency levels. Unlike the Learning Skills Inventory, users of the CCAI
are asked to perform certain communication tasks. Listening skills are measured directly,
without the use of "indirect paper/pencil instruments" and "in situations with which all
students are familiar. Thus, the test was constructed around an educational context and provides
measurements of how students communicate in classes, and with their professors and
peers" (Rubin, 1982, p. xxi). Users are asked to attend to a videotaped presentation of a short
lecture, much like one that would begin a college listening course. To my knowledge, this is the
only assessment device that uses videotape. Though studies have not indicated that this channel is
significantly more effective than others (i.e., audio-tape), it does increase my estimation of the
face validity of the test. I have few students who attend my words without also receiving my
gestures and other nonverbal, visual messages. Some research has indicated that the dynamism
and trustworthiness of the speaker are related to the long-term retention of information
(Roberts, 1980), and viewing a speaker delivering a speech gives clues about these two factors.
Based on this evidence, one could argue both for and against having video-taped presentations.
My argument would be for the affirmative side, though I would like to see a variety of presenters
in this format, rather than just one. Other listening tests have utilized several presenters to
offset the possibility of speaker/message/receiver interaction effects. Varying the sex, age,
race, ethnic characteristics, etc. of the senders is prudent if overall listening competence is to
be assessed. Though the same caution should be exercised with regard to the message content, I
like the fact that the content concerns listening. Even if the student lacks listening skills, the
test itself becomes a "learning experience." The test's validity has been assessed "by having
four of five communication faculty members agree on a blind placement of the stimulus
questions into the correct competency areas" (Rubin, 1982, p. xxii). The creator of the test
maintains that the "test is valid and the items are conceptually consistent" (Rubin, 1982,
p. xxii).

Another unique characteristic of the CCAI is the reliance on the oral response mode rather than
on the "indirect" paper and pencil responses of many other tests. Given the nature of the tasks
students are asked to do, this is a significant strength. The problems of possibly measuring
ineffective reading and writing skills are overcome. It does create some potential problems with
rater reliability, but none so significant that they could not be overcome with training.
However, students who are weak encoders could still demonstrate problems with the listening
competencies even if they were not really deficient. Of particular concern are those students
with severe communication apprehension. They might not have any problem listening and
recalling the information and would test out as superior students given a written response mode,
but would be found deficient if tested using the oral response mode. Given that the intended use of
this instrument is to identify communication problem areas, this could lead to a false diagnosis.
Since I have used the listening sections of the test in isolation, I have had to be especially careful
in this regard. If users administer the total test they should be able to discern such problems
during the encoding competency sections. The time element is a problem for those who are only
interested in the listening competency and still wish to make sure that the individual is not a
high apprehensive. I am somewhat distressed by those who are calling for a written
multiple-choice response mode for the listening section of the CCAI (Rubin & Shepherd, 1985).
While this would make large-scale testing situations practical, it would alter the nature of the
task being asked of the students and, I believe, take away several of the truly unique and positive
characteristics of the CCAI.

The CCAI is designed to assess understanding and short-term recall as well as the user's ability
to "recognize a fact in a class lecture or report" (Rubin, 1982). Testing recall rather than
recognition is one of the unique characteristics of this instrument. Most tests of listening
ability are set up to test recognition rather than recall. The two skills are not the same. Some
theorists hold that recognition retrieves information from a different memory store (semantic
store) than does recall (episodic store) (Cofer, 1975). This would not be a telling difference
if, indeed, the test did assess short-term memory as is maintained (Rubin, 1982; Rubin &
Shepherd, 1985). Both semantic and episodic memory stores are conceptualized as long-term
memory. However, after administering the test many times, it is clear to this user that the
instrument is tapping long-term memory. The taped lecture runs for more than seven minutes
and the questioning procedure clearly extends the period between stimulus presentation and
recall beyond the usually accepted duration of short-term memory (Cofer, 1975; Bostrom,
1985). This is not a severe conceptual flaw. I greeted this realization as a serendipitous
happening. While certain situations do require effective short-term memory, many more
require long-term retention. I find a measure of long-term memory recall ability much more
useful than one that would demonstrate short-term recall or recognition ability. This is
especially so for the listening context that this instrument focuses on - the classroom. That the
CCAI does measure long-term memory rather than short-term memory is demonstrated by the
research of Rubin and Shepherd (1985). The correlations with the "lecture listening" sections
of several other tests and the lower and/or lack of correlation with the short-term sections are
support for my view.

Finally, I have not found the CCAI as suited for research as some of the other listening tests. The
time factor, of course, weighs heavily against its use. Further, its original intent was "to
provide a measure of students' readiness for upper-level college coursework" (Rubin, 1982,
p. iv). The test is not designed to be used as a measure of competence for any one skill. This is
just as well, since some of the scoring seems to require a high level of rater expertise to ensure
reliability. For example, in scoring "Competency #8," the rater would give the respondent a
"three" if the student answered the question correctly, but did not "sound certain," a score of
"four" if the student gave the same answer, but this time with certainty, and a top score of
"five" if and only if the student "sounds certain that she/he is identifying a fact and gives a brief
reason why (referring to the study on which it is based)" (Rubin, 1982, p. 8). The question the
student responds to seems to be a closed one. Hence students who follow instructions carefully
are not given as much credit as those who go beyond the limits of the question. Raters not only
must read nonverbals to ascertain "certainty," but also keep their own nonverbals under control
so as to allow for the student to freely add information without being prompted. When I use the
test, I find it useful to ask students what they base their opinion on if they do not indicate this to
me freely. Many respond with the full and complete answer the highest score requires.
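
The three scoring levels quoted above can be written out as a small decision rule. The following Python sketch is only an illustration of the passage as summarized here, not the published CCAI scoring procedure: the function name and the three yes/no rater judgments are hypothetical, and because the passage does not describe the scores given to incorrect answers, the sketch returns None for them rather than guessing.

```python
from typing import Optional


def score_competency_8(answered_correctly: bool,
                       sounds_certain: bool,
                       gives_reason: bool) -> Optional[int]:
    """Hypothetical sketch of the three top scoring levels described above.

    The three arguments stand for a rater's yes/no judgments; they are
    assumptions made for illustration, not terms from the CCAI manual.
    """
    if not answered_correctly:
        # The passage does not describe scores below "three".
        return None
    if sounds_certain and gives_reason:
        # "five": sounds certain and gives a brief reason why
        return 5
    if sounds_certain:
        # "four": same correct answer, given with certainty
        return 4
    # "three": correct answer, but the student did not sound certain
    return 3


if __name__ == "__main__":
    # A student who answers the closed question exactly as asked
    print(score_competency_8(True, True, False))
    # The same student after the rater's follow-up elicits the reason
    print(score_competency_8(True, True, True))
```

Written this way, the dependence on rater judgment is plain: two of the three inputs ("sounds certain," "gives a reason") are impressions formed during the oral response, and the follow-up question described above changes only the last of them.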

Brown-Carlsen Listening Comprehension Test

The Brown-Carlsen Listening Comprehension Test was the first standardized listening test -
though there were a few that predated it that were not standardized. Since it was first
administered over 35 years ago, this test "has probably gone through more trials, revisions and
refinements than most tests" (Brown, 1985).

Brown (1985) has demonstrated that this test measured a unique skill that does not correlate
highly with reading ability (as measured by the Nelson-Denny Reading Test), intelligence (as
measured by the ACE), or scholastic achievement (as indicated by grade point average and class
rankings). Further efforts were made to establish validity "(1) by definition, (2) by subtest
interrelationships and (3) by subtest consistency" (Brown, 1985, p. 2). All showed support
for this operationalization of listening ability. Other tests indicated that the measure was both
reliable and appropriate for high school and college students.

One method for judging the "worth" of a measurement tool such as a listening test is to see how
much it has been used. By that criterion alone, one would have to grant great value to the
Brown-Carlsen Listening Comprehension Test. For more than thirty years, the majority of
researchers interested in studying listening have utilized this test. Of course, usage alone has
nothing to do with the value of the findings gained through the administration of this instrument.
As Watson and Barker (1981a) point out, results of listening studies "are of limited value
unless the instruments are reliable and valid measures of listening comprehension" (p. 187).
As is, and should be, the case with almost every major measurement device, the reliability and
validity of the instrument have been scrutinized. In the 1960's a number of critics looked
carefully at the leading listening measurement instruments (see, for example, Becker,
1963; Petrie, 1964; and Kelly, 1963, 1967).

Perhaps the most important criticism of listening tests was leveled by Charles Kelly. Among his
findings was that the Brown-Carlsen test and the STEP test correlated more highly with a test of
intelligence than they did with each other. When judging the adequacy of this line of argument, it
should be remembered that Brown presents evidence that his instrument does not correlate
with measures of general intelligence. Nonetheless, modern critics of listening tests often cite
Kelly's finding, interpreting it as an indictment of existing listening measurement techniques
and a rationale for the building of "more valid" tests of listening ability. I believe that this
interpretation is not totally accurate. For me, the major thrust of his study was to
differentiate between listening performance and listening ability. To develop his
argument, Kelly (1963) cites Stromer (1952), Hackett (1955), and others as he builds his
research case. None of these sources question the internal validity of listening tests so much as
they do the generalizability of listening research. While Kelly does criticize both the internal
and external validity of listening instruments, his attack on the latter is the more telling. He
forcefully argues that "we have a massive body of information about listening behavior of
subjects who knew they were going to be tested. This is important information dealing with one
type of listening activity. But we have done almost nothing to find out about performance across
the general range of situations from panic to boredom" (1967, p. 464). Kelly concludes that
"our traditional procedures for testing listening are sterile, as customarily used, and that
currently published listening tests are not valid measures of a unique skill such as has been
posited in much of the literature on listening" (1967, p. 455).

It is interesting to note that though many have accepted his criticism as valid, the majority of 
the creators of modern listening assessment instruments continue to test for listening 
competency and skill in situations where anticipatory set would be assumed to be functioning. 
Under such situations, Kelly's finding that mental ability and listening ability are correlated is 
not unexpected. 

It is likely that, as a subject's motivation to listen increases, the influence of
mental ability upon his comprehension will also increase. In other words, when
a listener's attention is maximal (as when taking a test), he probably makes
full use of his mental ability to comprehend what is being presented, and his
personality traits, past listening habits, etc. are relatively less important
(Kelly, 1967, p. 464).

It has been suggested that, rather than follow Kelly's suggestion and test for listening
competency and skill outside of the "motivating environment," we continue to control for
motivation by alerting the test takers, thus equalizing motivation.

Our judgments of either competence or skill must be based on observations of overt
behavior. Such judgments should be based on carefully controlled situations in
which the person to be judged is aware that his/her competence/skill is to be
observed and evaluated, and in circumstances in which the person is motivated to
be perceived as competent or skilled. The typical classroom situation may provide
such a setting. Under such circumstances it is possible to determine whether the
person can engage in the competent or skilled behavior. It is not possible,
however, to judge whether the person will engage in such behavior in later life.
Both competence and skill are abilities which are mediated by motivations in
everyday life and cannot be expected to be universally manifested in behavior
under all circumstances (McCroskey, 1982, p. 7).

The internal validity of the Brown-Carlsen test is another question. Even if limited to the
motivating atmosphere of the testing situation, information concerning a subject's ability to
listen would prove worthwhile. Whether the Brown-Carlsen instrument does tap listening
ability is an important question. Many have suggested that listening is a complex process,
rather than a simple, unidimensional skill. This view is not disputed by the developers of the
Brown-Carlsen test. Five subtests of the instrument measure the respondent's ability to
follow directions, recognize transitions, recognize word meanings, and recall information
immediately after it is presented and at a delayed time (lecture comprehension). But is this the
"complex of behaviors" that today's theorists call listening? The answer to this question is an
emphatic YES followed closely by a resounding NO! Indeed, there does not appear to be any
clear-cut consensus concerning the "correct" definition of that elusive concept. As Weaver
points out, "standardized tests... were built to measure skills their authors decided were the
critical subskills, and no two tests measure the same behaviors" (1976, p. 17). If any doubt
this assertion, they need only attend to the reasoning of modern theorists concerning the
conceptualization of listening (see, for example, Bostrom, 1985; Watson & Barker, 1985).

Differences in conceptualization and operationalization of variables abound in the literature.
Such diversity is neither a surprise, nor a "curse." It allows for the emergence of the most
robust theoretical explanation. As a result of the current "definitional jousting," several newer
listening measurement devices have been created to take the place of the Brown-Carlsen
Listening Comprehension Test. Partial justification for these instruments seems to be the
questionable validity of the Brown-Carlsen test. Given that other instruments were believed
unsuitable for certain research efforts, and rather than cease doing research, at least two new
instruments were created within the past five years to measure the complex concept that we call
listening.

Watson-Barker Listening Test

One such instrument is the Watson-Barker Listening Test. This test was developed in 1982 in
an attempt to create a standardized listening test that would be oriented primarily toward adults
and mature college-level audiences. Its "face validity was assessed by using a panel of listening
experts to judge the validity of each item" (Watson & Barker, 1984b, p. 1). Additional support
for the validity of the instrument has been generated by Rubin and Shepherd (1985) and by
Applegate and Campbell (1985). Both studies link this instrument with other listening
measurement tests. While such experiments will help to establish the efficacy of comparing
data of the various tests, they provide only a tautological validation of the instruments. If all
tests are highly correlated and if any one test is valid, then the validity claims of all tests can be
accepted. If no check of validity other than that of "face validity" is performed, all such claims
should be held in abeyance until the concept of "listening" is agreed upon substantively by
listening theorists. Roberts (1985) has presented some evidence that the Watson-Barker
Listening Test does correlate in predictable ways with the Receiver Apprehension Test. Given the
amount of criticism directed at the validity of the Brown-Carlsen instrument, more studies need
to be done before the majority of users accept the validity claims of this, or any newer
conceptualization of listening measurement.

Researchers can and do operationalize terms as they wish. If one accepts the operationalization,
then there is no difficulty concerning utility. The Watson-Barker instrument conceptualizes
listening as a combination of the receiver's ability to evaluate message content and emotional
meaning in messages, understand meaning in conversations, understand and retain lecture
information, and follow instructions and directions (Watson & Barker, 1985). I have used this
instrument for several reasons. I feel that the sub-skills it taps are more in line with my
research interests than some other instruments. It is easier and quicker to administer than
other competing tests. Though I would prefer to tap recall as well as recognition, I am willing to
sacrifice this factor when the number of subjects I need to test is large. I have had the
experience of coding the responses of over two hundred subjects on one immediate and three
delayed recall tests of material presented in a naturalistic setting. I would have to be very
interested in a project before I agreed to a similar operationalization in the future (or have
several graduate students under my thumb). The CCAI does not allow for the degree of precision
necessary for much of my statistical analyses, or it would provide an option that I might
consider.

I like the fact that most of the Watson-Barker stimulus material is capable of being generated in
a non-laboratory setting. This helps defend its generalizability. A fitting test of the instrument
would be to have subjects respond to questions that would mirror the content of this test under
conditions of "nonawareness of the intent to test" and then to compare those data with scores
generated "under testing conditions." The enormity of the effort required prompts me to
unselfishly offer the task to anyone else who might care to have it. Watson and Barker are
working on a video-taped version of their instrument. This should increase the "reality" of the
stimulus material.

There is some problem with "cheating" on the exam. Respondents can "look ahead" to the
questions, gaining hints as to future questions. Though this is not a necessary "flaw" in the test
and has more to do with its administration, future users should be aware of this and control for
it. Additionally, the version of the test that I am most familiar with does not contain a
"distraction" segment (though more current versions with distractions are available).
Nonetheless, at the end of each page it became necessary to stop the tape, allowing people to turn
the pages of the test booklet without having the disturbing sound of papers bother the reception
of other test takers.

While I have used the Watson-Barker test in research situations, I have found it more useful as
a "consciousness raiser" in my basic speech classes and in various seminars offered for
industry. It has proven a quick method for demonstrating the need of further listening training
for both constituencies. The self-scoring answer sheet is especially welcome. While using it in
one seminar I did uncover a possible problem with some of the stimulus material. It seems that
people who have experience with instructions similar to those presented on the tape (i.e., copy
machines) have fewer problems with those areas even though their ability to answer other
questions in the same section is inadequate. Future users may wish to evaluate this "history"
component in their own respondents.
Kentucky Comprehensive Listening Test

The Kentucky Comprehensive Listening Test (KCLT), like the Watson-Barker instrument, was
created just a few years ago and is oriented towards adults. Like the Watson-Barker test, the
KCLT has served to spur research activity these past several years. Again, like the
Watson-Barker test, the KCLT is claimed by its developer to measure a variety of different
sub-skills. The similarities continue in that both are mediated by other communication skills,
and both are delivered via audio tape.

In terms of usage, I have found the KCLT less amenable, in its full form, for my research and
"consciousness-raising" purposes because of its length. The lecture material was considered too
drawn out and uninteresting by several groups. Possibly in response to experiences such as
mine, Bostrom (1984) has suggested that, for research projects, users dispense with the
lecture section. My respondents also found the short-term retention tasks unrealistic, and were
prone to "cheat" by looking at the various possible answers beforehand and developing methods
for "remembering" the correct answer.

In several instances there has been an interesting response to the "distraction" section of the
test. It may be because of the naivete of my subjects, but for whatever reason, they were
especially interested in the very distracting conversation that is used to disrupt the listener's
concentration. Those who demonstrated intense interest in the distraction and could remember,
with almost total recall, the conversation, also scored rather well on that section. If this
audio-taped version is ever converted into a videotape, many will want to view the young lady
in the role of the "distracter."

Of crucial consideration for potential users are the dissimilarities between the
Watson-Barker test and the KCLT. As indicated previously, users choose measurement
instruments partially on the basis of the underlying conceptual process that is operationalized
by the test. For Bostrom that underlying process is, of course, listening. However, unlike
Watson and Barker, he discounts several possible sub-processes that they suggest partially
make up listening, and concentrates on short-term memory, short-term memory with
rehearsal, "lecture listening" (long-term memory), selective listening, and interpretive
listening (Bostrom, 1985). He discounts the sub-process of comprehension, reasoning that
"common sense tells us that we can listen without full understanding —in fact, often the 
question of understanding is irrelevant" (1985, p. 4). As a user I am aware that people can
listen without full understanding; indeed, episodic long-term memory is conceptualized to be
made up entirely of "nonverbal, nonsymbolic" information. However, I also believe that we
store information in semantic memory which, as conceptualized by some, must include the
concept of comprehension. Limiting retrieval tasks to "recognition" would make 
"comprehension" less necessary, but expanding the test to cover the full spectrum of memory 
tasks would demand its inclusion. 

Bostrom likewise has a very powerful argument against including the affect dimension in any
consideration of listening. He points out that we cannot specify any observable behavior that
would allow an observer to ascertain when someone else is listening effectively: "The real
poverty of an attitudinal approach is dramatized when one attempts to codify it into some
concrete behaviors which will result in 'good' listening. Beyond acting interested and not looking
at one's watch while the other is talking, there is little to do to convince others that one is really
listening" (1985, p. 6). From a user's standpoint such an argument is not persuasive. We often
rely on self-report instruments. If we accept his viewpoint, what are we to do when we attempt
to assess attitude change? Evidence seems to indicate that there are no necessarily conclusive
behavioral indications that people believe or feel any particular way. Further, is it so
important for the listening teacher to instruct his charges on how to seem like they are good
listeners, or is it more important for the teacher to help her pupils be good listeners? Tests
like the Learning Skills Inventory attempt to assess affect. Perhaps elements of the two together
might be combined to yield a richer assessment of both listening ability and performance - an
outcome seemingly called for by Kelly twenty years ago.

While I do disagree with some of Bostrom's reasoning concerning the conceptualization of
listening, I strongly agree with his feeling that people listen differently in different situations.
I feel that listening is self-directed. People listen as they wish to, either consciously or
subconsciously. These desires are under their control - listening effectively is the listener's
responsibility. Speakers can get people to listen, but only by persuading them that it will be
useful for them to listen. For the sender the goal is to "get the receivers interested in listening."
It is my understanding that Bostrom's functional approach would allow for this stance.

He points out many advantages of this approach when he states that, first,

it provides a comprehensive theoretical model based on fairly well-known memory 
functions; and fits well into private-public models of communicative behavior. 
Second, it provides a comprehensive answer to the problems originally raised by 
Kelly and ignored by researchers since the middle 1960's. Third, it points to new
directions in listening research, an area substantially ignored by researchers for a 
number of years (Bostrom, 1985, p. 16). 

Notwithstanding our different placing of emphasis concerning Kelly's critique of listening
measurement, his point is well taken. His analogy to the various types of listening is intriguing
as well, as he wonders " ... if ordinary 'sending' behavior is of many types, should not the
receiving activity associated with it also vary?" (1985, p. 8). A word of caution may be
required with regard to pursuing this line of investigation. Cronkhite (1974) suggested that
research be undertaken to investigate "variables that influence the audience's ability to reliably
evaluate messages," and to "turn our existing speaker-oriented research upside down to discover
implications for critical listening" (pp. 81-82). Sprague (1974), too, called for the
translation of "speaker-oriented, control-oriented theories and research findings into
receiver-oriented, choice expanding implications" (p. 83). While the pleas of such scholars for
the creation of listening theories are persuasive, the successful translation of sender theories
into receiver theories has not yet happened. It may be that it never will. Crucial to the success
of such a venture is the implied linkage between encoding and decoding. If we are to flip these
theories over so that they address themselves to effective listening rather than effective
speaking, should we not first ascertain if there is such a connection? So far such connections
have been suggested by many, accepted axiomatically by some, and substantiated by no published
research that I have discovered. The fulcrum that would allow us this Atlas-like task remains
elusive. We have yet to ascertain if listening and speaking "mirror" each other or "shadow" one
another. Does one process reverse its opposite, repeat it, or are they totally different from each
other? If they are reversible, then we can "turn our... research upside down." But if the latter
is the case, data-derived theories of speaking "only" need be generalized "right-side up" to apply
to listening situations.

Currently a definition battle is being waged, albeit a refined, scholarly one, concerning the 
conceptualization of listening. Bostrom presents ample evidence of the validity of his measure,
as do all of the creators of listening measurement instruments. One such claim is that "each of
the scales represents an actual instance of the performance of the skill in question" (Bostrom,
1984, p. 2). With claims such as this, he and others seem to be attempting to avoid a full-scale
war by begging the question that while the definition of listening has not been agreed upon, the 
various sub-skills that the various tests measure have been universally accepted. If the 
"whole" has not been agreed upon, then the "parts" that make up the totality of that "whole" are 
no surer. While all, save the measurers of affect, would include "retention" within the
conceptual framework of listening, what aspects of retention are important and how they are to
be tapped is not so certain. 

Conclusions of a User 

It seems certain, to this user, that the debate as to which instrument is the most appropriate
measure of listening competency, skill, and/or performance will continue. It seems just as
certain that the definition battle outlined above will be waged concurrently. One instrument
cannot win acceptance without its corresponding underlying conceptual definition being agreed to by
the majority of users. Perhaps no one instrument will be found to be acceptable for all
situations. In any case, we should not be upset by such jousting. Nor should we stand
dispassionately aside as the "contest for acceptance" goes on. We users should enter into the
arena as "interested third parties" acting in the role, perhaps, of "devil's advocates." Subjecting
these measurement instruments to the fire of our scrutiny will result in much more sound
listening test(s). The need for such a tool is evident. Without it, we cannot hope to develop
effective methods of listening instruction. Given the rather sketchy evidence available, it is
difficult to argue with Erway's (1972) contention that gains from listening instruction are not
maintained over time. Most listening research studies are "quick and dirty." Few longitudinal
studies have been done. The generalizability of most studies is severely limited by the nature of
the subject population drawn upon. By far the most prevalent educational level in listening
research is the elementary school level. Fewer studies have been carried out at the secondary
level, and fewer still have been completed using college-age subjects. This inverse relationship
between the number of studies and the age of subjects seems to mirror the relationship between
age and potential for listening improvement that some researchers have alluded to in their
articles (Evans, 1960; Evertts, 1962; Lieb, 1965).

A close reading of most listening texts reveals that there is little reason to support the
contention that we currently are teaching listening effectively. For such support we continue to
have to fall back upon the subjective judgments of other teachers of listening. Erway (1972)
has suggested, "the most impressive evidence comes not from research but from the prejudiced
reports of students who have experienced instruction and from the observation of instructors"
(p. 23). This "evidence" must be considered especially suspect in light of the finding that people
tend to think more highly of themselves as listeners than test scores indicate and that they are
less able to discriminate between good and poor listening than they are between good and poor
speaking (Stark, 1956). While it can be argued persuasively that we should teach listening at
all educational levels, the only well documented listening finding is that listening is not being 
taught in most academic institutions. 

Implementing longitudinal investigations that would document effective methods for teaching 
listening would help to reverse this tendency towards lip service. If scarcity does increase the 
value of a commodity, the results of such studies done in the classroom situation would prove 
very worthwhile. Prior to 1970, only fifteen empirical studies investigated pedagogical
phenomena by first teaching teachers to behave in some particular way, then observing them to
make sure they did behave in that way, and, finally, testing their students to note changes
(Sprague, 1974). As noted previously, there are pronounced problems in generalizing
laboratory research to the classroom and beyond. What is lost in terms of ability to control and
limit experimental artifacts would be made up for in terms of the vigor and power of the
generalizability of the resultant data.

Until the majority of our field, who are interested in listening research, agree upon a definition 
of listening and instruments to tap that conceptualization, we users will not be able to proceed 
with our full attention to develop such experimental paradigms. Until such experiments are 
conducted, teachers interested in increasing listening skills can do no better than rely on the
unsubstantiated platitudes that currently make up the bulk of our listening instruction. We will 
continue to tell our students to "Withhold evaluation of the message until the speaker is finished" 
(Barker, 1984, p. 55) and hope they don't ask us too many questions about the research that
indicates that that is appropriate behavior. There is no research documentation that would 
support such imperatives. One even could argue that such a course of action is inefficient since 
it causes you to listen to unimportant as well as senseless drivel. Further, even if that 
inefficiency were shown to be necessary and/or useful, no pedagogical direction is available that 
would allow a teacher to help students carry out that directive. How does one "withhold 
evaluation" on the attitudinal level? Does the evaluation only matter if done on the "conscious" 
level? Does it matter if people do evaluate a speaker, if they still continue to listen to him?

We need to develop measures that are valid measures of listening, regardless of where and under
what circumstances that activity takes place. Perhaps several instruments will be needed to
cover all of the important contexts we wish to tap into. Expediency necessitates that we then 
undertake investigations to ascertain how we can best facilitate more effective listening. It may 
well be that our listening texts have more substance than alluded to above. If research reveals 
that there are founts of knowledge and potent developers of skills already extant, more weight 
can be applied in the effort to wedge in listening instruction in our already crowded curricula.
If none of our current teaching imperatives are supported, future research directions will be 
more clear and the weight of unsubstantiated dogma will no longer have to be borne by listening
instructors. Whichever the case, we need to go forward. "As long as we lack such research we
shall be bound to myths and superstitions which are interesting subject matter for our methods 
courses, but which have little relevance for the real world" (Sprague, 1974).